Unified input API: artifact_id + data replaces input_csv #207
Conversation
Processing tools now accept artifact_id (a UUID from upload_data) or data (list[dict]) instead of input_csv/input_data/input_json. Adds an upload_data tool for URL/file ingestion and a request_upload_url tool for presigned large-file uploads in HTTP mode.

- Phase 1: Simplified _SingleSourceInput and MergeInput models
- Phase 2: upload_data tool (URL + local path + Google Sheets)
- Phase 3: Presigned URL upload system (HMAC, Redis metadata, REST endpoint)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
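The Phase 3 presigned-upload flow can be sketched as follows. This is a minimal illustration, not the PR's actual implementation: the function names, the `artifact_id:expires` payload layout, and the 15-minute default TTL are assumptions; the real system also stores metadata in Redis, which is omitted here.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

def sign_upload_url(base_url: str, artifact_id: str, secret: str, ttl_seconds: int = 900) -> str:
    """Build a presigned upload URL carrying an expiry and an HMAC-SHA256 signature."""
    expires = int(time.time()) + ttl_seconds
    payload = f"{artifact_id}:{expires}"
    signature = hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()
    query = urlencode({"artifact_id": artifact_id, "expires": expires, "signature": signature})
    return f"{base_url}?{query}"

def verify_upload_signature(artifact_id: str, expires: int, signature: str, secret: str) -> bool:
    """Reject expired or tampered requests; compare_digest avoids timing side channels."""
    if time.time() > expires:
        return False
    payload = f"{artifact_id}:{expires}"
    expected = hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Because the signature is derived from a shared secret rather than per-process state, any pod can verify a URL that another pod issued, which is what makes this work in multi-replica deployments.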
- Adopt _aid_or_dataframe and _input_data_mode properties on models, removing free functions from tools.py
- Add left/right properties to MergeInput
- Use `is not None` for artifact_id checks
- Fix sync context manager in uploads.py (with → async with)
- Remove duplicate max_inline_rows in config.py
- Update integration tests to use data instead of the removed input_csv
- Reject empty CSV in the upload_data local file path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
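The property-based pattern from the first bullet might look roughly like this. A simplified sketch using a plain dataclass rather than the project's actual pydantic model; the property names mirror the commit message but the bodies are assumptions. Note the `is not None` check, which correctly handles any falsy-but-set values:

```python
from __future__ import annotations

from dataclasses import dataclass

import pandas as pd

@dataclass
class SingleSourceInput:
    """Exactly one of artifact_id (server-side upload) or data (inline rows) is set."""
    artifact_id: str | None = None
    data: list[dict] | None = None

    @property
    def input_data_mode(self) -> str:
        # `is not None` distinguishes "unset" from other falsy values
        return "artifact" if self.artifact_id is not None else "inline"

    @property
    def aid_or_dataframe(self) -> str | pd.DataFrame:
        # Tools pass the artifact_id through to the server, or build a
        # DataFrame locally from the inline rows.
        if self.artifact_id is not None:
            return self.artifact_id
        return pd.DataFrame(self.data)
```

Moving this logic onto the model means each tool reads `params.aid_or_dataframe` instead of calling shared free functions in tools.py.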
- /pub URLs now correctly convert to /export?format=csv
- A headers-only CSV fetched from a URL raises a clear "empty CSV" error instead of the misleading "could not parse as CSV or JSON"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
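The /pub fix can be illustrated with a small normalizer. This is a hedged sketch: the function name is hypothetical, and it simply rewrites any Google Sheets share link (/pub, /edit, etc.) to the /export?format=csv form described above, leaving non-Sheets URLs untouched:

```python
import re

def normalize_sheet_url(url: str) -> str:
    """Rewrite Google Sheets share links to a direct CSV export URL.

    Illustrative helper: captures the spreadsheet base path and replaces
    whatever suffix follows it (/pub, /edit#gid=0, ...) with /export?format=csv.
    """
    m = re.search(r"(https://docs\.google\.com/spreadsheets/d/[^/]+)", url)
    if m:
        return f"{m.group(1)}/export?format=csv"
    return url  # not a Sheets link; pass through unchanged
```

The earlier behavior of fetching the /pub page directly returned HTML, which is what produced the confusing "could not parse as CSV or JSON" error.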
Remove the auto-generated per-process secret, since it breaks when pods don't share state. The server now fails fast with a clear error if the secret is unset. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
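The fail-fast behavior might look like the following. The environment variable name and message wording are assumptions for illustration; the point is that a randomly generated per-process secret would cause each replica to sign with a different key, so HMAC verification would fail whenever the verifying pod differs from the signing pod:

```python
import os

def require_upload_secret() -> str:
    """Fail fast at startup when no shared upload secret is configured.

    Hypothetical variable name: EVERYROW_UPLOAD_SECRET. Every pod must be
    given the same value so presigned URLs verify on any replica.
    """
    secret = os.environ.get("EVERYROW_UPLOAD_SECRET", "")
    if not secret:
        raise RuntimeError(
            "upload_secret is required in HTTP mode: set EVERYROW_UPLOAD_SECRET "
            "to the same value on every pod"
        )
    return secret
```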
Add len(v) == 0 check to _SingleSourceInput and MergeInput validate_data_size validators to prevent wasteful zero-row task submissions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
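A plausible shape for that validator body, written as a standalone function rather than the actual pydantic validator; the 5000-row cap is an illustrative placeholder, not the project's configured max_inline_rows:

```python
def validate_data_size(v, max_inline_rows: int = 5000):
    """Reject inline data that is empty or too large to submit inline.

    Sketch of the validate_data_size logic described above; in the project
    this runs as a field validator on _SingleSourceInput and MergeInput.
    """
    if v is None:
        return v  # caller is using artifact_id instead of inline data
    if len(v) == 0:
        raise ValueError("data is empty: provide at least one row, or pass artifact_id")
    if len(v) > max_inline_rows:
        raise ValueError(
            f"data has {len(v)} rows, above the inline limit of {max_inline_rows}; "
            "upload it and pass artifact_id instead"
        )
    return v
```

Catching `data=[]` here fails the request at validation time instead of submitting a zero-row task to the server.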
@claude code review
Claude finished @RafaelPo's task in 2m 12s

Code Review: Unified Input API v2
Overall this is a well-structured refactor.

1. SSRF risk in
- Fix misleading upload_secret description: say "Required in HTTP mode" instead of "Auto-generated if empty" (it raises RuntimeError)
- Remove unused load_data() and its 8 tests, superseded by model properties (_aid_or_dataframe) and the upload_data tool
- Rename _is_url → is_url since it's imported across modules

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
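The renamed helper is likely something along these lines. A minimal sketch, assuming the standard-library urlparse approach; the actual implementation may differ:

```python
from urllib.parse import urlparse

def is_url(source: str) -> bool:
    """Treat a source string as a URL only when it has an http(s) scheme and a host.

    Public name (no leading underscore) because other modules import it.
    Plain file paths, including Windows-style C:\\... paths, return False.
    """
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Checking both the scheme and the netloc avoids misclassifying `C:\data.csv`, which urlparse reads as scheme `c` with an empty host.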
When a task is submitted via artifact_id (not inline data), the row count is unknown client-side. The fallback labels now say "artifact" instead of the vague "rows for X" to make clear that the server-side artifact is being processed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```python
if is_url(params.source):
    df = await fetch_csv_from_url(params.source)
else:
```
Bug: The everyrow_upload_data function lacks error handling for malformed local CSV files, which will cause an unhandled exception during parsing with pandas.
Severity: MEDIUM
Suggested Fix
Wrap the pd.read_csv(params.source) call within a try-except block. Catch potential exceptions from pandas, such as ParserError, and return a user-friendly error message, similar to the error handling implemented for the REST endpoint in uploads.py.
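A sketch of that suggested fix, extracted into a standalone helper for illustration; the function name and exact messages are assumptions, and in the PR this logic would live inside everyrow_upload_data:

```python
import pandas as pd

def read_local_csv(path: str) -> pd.DataFrame:
    """Parse a local CSV, surfacing pandas failures as user-friendly errors,
    mirroring the error handling in the REST endpoint and the URL path."""
    try:
        df = pd.read_csv(path)
    except pd.errors.EmptyDataError:
        raise ValueError(f"empty CSV: {path} has no parseable content")
    except pd.errors.ParserError as exc:
        raise ValueError(f"could not parse {path} as CSV: {exc}")
    if df.empty:
        # Headers-only files parse successfully but yield zero rows.
        raise ValueError(f"empty CSV: {path} contains headers but no rows")
    return df
```

The `df.empty` check also covers the headers-only case, matching the "Reject empty CSV in upload_data local file path" commit above.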
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: everyrow-mcp/src/everyrow_mcp/tools.py#L626
Potential issue: The `everyrow_upload_data` function does not handle potential errors
when parsing a local CSV file using `pd.read_csv`. While the existence of the file is
checked, the content is not validated. If the file is malformed or not a valid CSV,
`pd.read_csv` will raise an unhandled exception, which will propagate up and cause the
tool to fail. This is inconsistent with other parts of the application, such as the REST
endpoint and the URL-based upload functionality, which both include explicit error
handling for CSV parsing.
Summary
- Replaces input_csv with a unified artifact_id + data input API across all MCP tools (screen, rank, dedupe, merge, agent, single_agent)
- Adds uploads.py with HMAC-signed upload URL support for multi-pod deployments
- Adds /pub URL normalization
- Rejects empty data (data=[]) in validators to prevent zero-row task submissions

Test plan
- /pub, /edit, and export URLs

🤖 Generated with Claude Code